Smart Paper Notes

A Concurrent CNN-RNN Approach for Multi-Step Wind Power Forecasting

Syed Kazmi , Berk Gorgulu , Mucahit Cevik , Mustafa Gokce Baydogan

Category: Machine Learning

2023-01-02

Wind power forecasting helps with the planning for the power systems by contributing to having a higher level of certainty in decision-making. Due to the randomness inherent to meteorological events (e.g., wind speeds), making highly accurate long-term predictions for wind power can be extremely difficult. One approach to remedy this challenge is to utilize weather information from multiple points across a geographical grid to obtain a holistic view of the wind patterns, along with temporal information from the previous power outputs of the wind farms. Our proposed CNN-RNN architecture combines convolutional neural networks (CNNs) and recurrent neural networks (RNNs) to extract spatial and temporal information from multi-dimensional input data to make day-ahead predictions. In this regard, our method incorporates an ultra-wide learning view, combining data from multiple numerical weather prediction models, wind farms, and geographical locations. Additionally, we experiment with global forecasting approaches to understand the impact of training the same model over the datasets obtained from multiple different wind farms, and we employ a method where spatial information extracted from convolutional layers is passed to a tree ensemble (e.g., Light Gradient Boosting Machine (LGBM)) instead of fully connected layers. The results show that our proposed CNN-RNN architecture outperforms other models such as LGBM, Extra Tree regressor and linear regression when trained globally, but fails to replicate such performance when trained individually on each farm. We also observe that passing the spatial information from CNN to LGBM improves its performance, providing further evidence of CNN's spatial feature extraction capabilities.

translated by Google Translate

Lisan: Yemeni, Iraqi, Libyan, and Sudanese Arabic Dialect Corpora with Morphological Annotations

Mustafa Jarrar , Fadi A Zaraket , Tymaa Hammouda , Daanish Masood Alavi , Martin Waahlisch

Category: Natural Language Processing

2022-12-13

This article presents Lisan, a set of morphologically annotated corpora for the Yemeni, Sudanese, Iraqi, and Libyan Arabic dialects. Lisan features around 1.2 million tokens. We collected the content of the corpora from several social media platforms. The Yemeni corpus (~1.05M tokens) was collected automatically from Twitter. The corpora of the other three dialects (~50K tokens each) were collected manually from Facebook and YouTube posts and comments. Thirty-five (35) annotators who are native speakers of the target dialects carried out the annotations. The annotators segmented all words in the four corpora into prefixes, stems, and suffixes and labeled each with morphological features such as part of speech, lemma, and an English gloss. An Arabic Dialect Annotation Toolkit (ADAT) was developed for the purpose of the annotation. The annotators were trained on a set of guidelines and on how to use ADAT. We developed ADAT to assist the annotators and to ensure compatibility with the SAMA and Curras tagsets. The tool is open source, and the four corpora are also available online.

Metric Learning as a Service with Covariance Embedding

Imam Mustafa Kamal , Hyerim Bae , Ling Liu

Category: Computer Vision

2022-11-28

With the emergence of deep learning, metric learning has gained significant popularity in numerous machine learning tasks dealing with complex and large-scale datasets, such as information retrieval, object recognition, and recommendation systems. Metric learning aims to maximize intra-class similarity and minimize inter-class similarity. However, existing models mainly rely on distance measures to obtain a separable embedding space and implicitly maximize the intra-class similarity while neglecting the inter-class relationship. We argue that to enable metric learning as a service for high-performance deep learning applications, we should also deal carefully with inter-class relationships to obtain a more advanced and meaningful embedding space representation. In this paper, a novel metric-learning-as-a-service methodology is presented that incorporates covariance to signify the direction of the linear relationship between data points in an embedding space. Unlike conventional metric learning, our covariance-embedding-enhanced approach makes metric learning as a service more expressive when computing similarity or dissimilarity measures and can capture positive, negative, or neutral relationships. Extensive experiments conducted using various benchmark datasets, including natural, biomedical, and facial images, demonstrate that the proposed model as a service with covariance-embedding optimizations can obtain higher-quality, more separable, and more expressive embedding representations than existing models.
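The contrast the abstract draws between distance measures and covariance can be illustrated with a small sketch. The abstract does not say how the covariance embedding is actually computed, so everything here is an assumption made for illustration: the function names, the toy vectors, and the choice to take the sample covariance across the components of two embedding vectors. The point is only that a distance is always non-negative, while covariance carries a sign and can therefore separate positive, negative, and neutral relationships.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors (never negative)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def covariance(u, v):
    """Sample covariance across the components of two equal-length vectors."""
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v)) / (n - 1)


def relationship(u, v, tol=1e-9):
    """Label the sign of the covariance (hypothetical helper, not from the paper)."""
    c = covariance(u, v)
    if c > tol:
        return "positive"
    if c < -tol:
        return "negative"
    return "neutral"


# Toy embedding vectors, invented for this example.
anchor = [1.0, 2.0, 3.0, 4.0]
aligned = [2.0, 3.0, 4.0, 5.0]   # components rise together with the anchor
opposed = [4.0, 3.0, 2.0, 1.0]   # components fall as the anchor rises
flat = [2.5, 2.5, 2.5, 2.5]      # no linear trend at all

for name, vec in [("aligned", aligned), ("opposed", opposed), ("flat", flat)]:
    print(name, round(euclidean(anchor, vec), 3), relationship(anchor, vec))
```

All three distances are positive numbers that say nothing about direction, whereas the covariance sign distinguishes the three cases.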

BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

Teven Le Scao , Angela Fan , Christopher Akiki , Ellie Pavlick , Suzana Ilić , Daniel Hesslow , Roman Castagné , Alexandra Sasha Luccioni , François Yvon , Matthias Gallé

Category: Natural Language Processing

2022-11-09

Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.

A Survey on Computer Vision based Human Analysis in the COVID-19 Era

Fevziye Irem Eyiokur , Alperen Kantarcı , Mustafa Ekrem Erakın , Naser Damer , Ferda Ofli , Muhammad Imran , Janez Križaj , Albert Ali Salah , Alexander Waibel , Vitomir Štruc

Category: Computer Vision

2022-11-07

The emergence of COVID-19 has had a global and profound impact, not only on society as a whole, but also on the lives of individuals. Various prevention measures were introduced around the world to limit the transmission of the disease, including face masks, mandates for social distancing and regular disinfection in public spaces, and the use of screening applications. These developments also triggered the need for novel and improved computer vision techniques capable of (i) providing support to the prevention measures through an automated analysis of visual data, on the one hand, and (ii) facilitating normal operation of existing vision-based services, such as biometric authentication schemes, on the other. Especially important here are computer vision techniques that focus on the analysis of people and faces in visual data and have been affected the most by the partial occlusions introduced by the mandates for facial masks. Such computer vision based human analysis techniques include face and face-mask detection approaches, face recognition techniques, crowd counting solutions, age and expression estimation procedures, models for detecting face-hand interactions, and many others, and have seen considerable attention over recent years. The goal of this survey is to provide an introduction to the problems COVID-19 has induced in such research and to present a comprehensive review of the work done in the computer vision based human analysis field. Particular attention is paid to the impact of facial masks on the performance of various methods and to recent solutions that mitigate this problem. Additionally, a detailed review of existing datasets useful for the development and evaluation of methods for COVID-19 related applications is provided. Finally, to help advance the field further, a discussion of the main open challenges and future research directions is given.

Generalisability of deep learning models in low-resource imaging settings: A fetal ultrasound study in 5 African countries

Carla Sendra-Balcells , Víctor M. Campello , Jordina Torrents-Barrena , Yahya Ali Ahmed , Mustafa Elattar , Benard Ohene Botwe , Pempho Nyangulu , William Stones , Mohammed Ammar , Lamya Nawal Benamer

Category: Computer Vision

2022-09-20

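Most artificial intelligence (AI) research has concentrated on high-income countries, where imaging data, IT infrastructure, and clinical expertise are plentiful. However, slower progress has been made in limited-resource environments where medical imaging is needed. For example, in Sub-Saharan Africa the rate of perinatal mortality is very high due to limited access to antenatal screening. In these countries, AI models could be implemented to help clinicians acquire fetal ultrasound planes for the diagnosis of fetal abnormalities. So far, deep learning models have been proposed to identify standard fetal planes, but there is no evidence of their ability to generalise to centres with limited access to high-end ultrasound equipment and data. This work investigates different strategies to reduce the domain-shift effect for a fetal plane classification model trained at a high-resource clinical centre and transferred to a new low-resource centre. To that end, a classifier is first evaluated on a new centre in Denmark with 1,008 patients and is later optimised to reach the same performance in five African centres (Egypt, Algeria, Uganda, Ghana, and Malawi) with 25 patients each. The results show that a transfer learning approach can be a solution for integrating small African samples with existing large-scale databases in developed countries. In particular, the model can boost performance by increasing the recall to $0.92 \pm 0.04$ while maintaining a high precision. This framework shows promise for building new AI models that generalise across clinical centres with limited data acquired in challenging and heterogeneous conditions, and it calls for further research to develop new solutions for the usability of AI in countries with fewer resources.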
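For readers unfamiliar with the reporting convention, a figure such as a recall of 0.92 ± 0.04 is typically a mean and a spread over several evaluation units. The sketch below shows one way such a summary could be computed from per-centre confusion counts. It is not the authors' evaluation code: the centre names and counts are invented placeholders, and the use of the sample standard deviation across centres is an assumption, since the abstract does not say what the ± term denotes.

```python
from statistics import mean, stdev


def recall(tp, fn):
    """Fraction of true positives that the model found: TP / (TP + FN)."""
    return tp / (tp + fn)


def precision(tp, fp):
    """Fraction of predicted positives that are correct: TP / (TP + FP)."""
    return tp / (tp + fp)


# Hypothetical per-centre counts, invented for illustration only.
counts = {
    "centre_a": {"tp": 18, "fp": 2, "fn": 2},
    "centre_b": {"tp": 16, "fp": 4, "fn": 4},
    "centre_c": {"tp": 20, "fp": 0, "fn": 0},
}

recalls = [recall(c["tp"], c["fn"]) for c in counts.values()]
precisions = [precision(c["tp"], c["fp"]) for c in counts.values()]

# Summarise across centres as mean +/- sample standard deviation (an assumption).
recall_mean, recall_std = mean(recalls), stdev(recalls)
precision_mean, precision_std = mean(precisions), stdev(precisions)

print(f"recall    = {recall_mean:.2f} +/- {recall_std:.2f}")
print(f"precision = {precision_mean:.2f} +/- {precision_std:.2f}")
```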

PaLI: A Jointly-Scaled Multilingual Language-Image Model

Xi Chen , Xiao Wang , Soravit Changpinyo , AJ Piergiovanni , Piotr Padlewski , Daniel Salz , Sebastian Goodman , Adam Grycner , Basil Mustafa , Lucas Beyer

Category: Computer Vision | Natural Language Processing

2022-09-14

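Effective scaling and a flexible task interface enable large language models to excel at many tasks. PaLI generates text based on visual and textual inputs, and with this interface performs many vision, language, and multimodal tasks in many languages. To train PaLI, we make use of large encoder-decoder language models and Vision Transformers (ViTs). This allows us to capitalize on their existing capabilities and leverage the substantial cost of training them. We find that joint scaling of the vision and language components is important. Since existing Transformers for language are much larger than their vision counterparts, we train the largest ViT to date (ViT-e) to quantify the benefits of even larger-capacity vision models. To train PaLI, we create a large multilingual mix of training tasks, based on a new image-text training set containing 10B images and texts in over 100 languages. PaLI achieves state-of-the-art results in multiple vision and language tasks (such as captioning, visual question answering, and scene-text understanding), while retaining a simple, modular, and scalable design.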

Masader Plus: A New Interface for Exploring +500 Arabic NLP Datasets

Yousef Altaher , Ali Fadel , Mazen Alotaibi , Mazen Alyazidi , Mishari Al-Mutairi , Mutlaq Aldhbuiub , Abdulrahman Mosaibah , Abdelrahman Rezk , Abdulrazzaq Alhendi , Mazen Abo Shal

Category: Natural Language Processing

2022-08-01

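Masader (Alyafeai et al., 2021) created a metadata structure for cataloguing Arabic NLP datasets. However, developing an easy way to explore such a catalogue is a challenging task. To give users and researchers the best experience when exploring the catalogue, several design and user experience challenges must be resolved. Furthermore, user interactions with the website may provide an easy way to improve the catalogue. In this paper, we introduce Masader Plus, a web interface for users to browse Masader. We demonstrate data exploration, filtering, and a simple API that allows users to examine datasets from the backend. Masader Plus can be explored using this link: https://arbml.github.io/masader. A video recording explaining the interface can be found here: https://www.youtube.com/watch?v=setDlseqchk.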

Theseus: A Library for Differentiable Nonlinear Optimization

Luis Pineda , Taosha Fan , Maurizio Monge , Shobha Venkataraman , Paloma Sodhi , Ricky Chen , Joseph Ortiz , Daniel DeTone , Austin Wang , Stuart Anderson

Category: Robotics | Computer Vision | Machine Learning

2022-07-19

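We present Theseus, an efficient application-agnostic open-source library for differentiable nonlinear least squares (DNLS) optimization built on PyTorch, providing a common framework for end-to-end structured learning in robotics and vision. Existing DNLS implementations are application-specific and do not always incorporate many ingredients important for efficiency. Theseus is application-agnostic, as we illustrate with several example applications that are built using the same underlying differentiable components, such as second-order optimizers, standard cost functions, and Lie groups. For efficiency, Theseus incorporates support for sparse solvers, automatic vectorization, batching, GPU acceleration, and gradient computation with implicit differentiation and direct loss minimization. We carry out an extensive performance evaluation on a set of applications, demonstrating significant efficiency gains and better scalability when these features are incorporated. Project page: https://sites.google.com/view/theseus-ai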
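To make "nonlinear least squares with a second-order optimizer" concrete, here is a minimal Gauss-Newton sketch in plain Python. It does not use or imitate the Theseus API, and it is not differentiable end to end; it only shows the kind of inner optimization problem such a library solves. The model y = exp(k·x), the data, the starting point, and the function name `gauss_newton_fit` are all assumptions chosen for illustration.

```python
import math


def gauss_newton_fit(xs, ys, k0=0.0, max_iters=50, tol=1e-12):
    """Fit y = exp(k * x) by Gauss-Newton on the residuals r_i = exp(k * x_i) - y_i.

    With a single parameter, the normal equations (J^T J) * step = -J^T r
    reduce to a scalar division.
    """
    k = k0
    for _ in range(max_iters):
        preds = [math.exp(k * x) for x in xs]
        residuals = [p - y for p, y in zip(preds, ys)]
        # d r_i / d k = x_i * exp(k * x_i)
        jacobian = [x * p for x, p in zip(xs, preds)]
        jt_r = sum(j * r for j, r in zip(jacobian, residuals))
        jt_j = sum(j * j for j in jacobian)
        step = -jt_r / jt_j
        k += step
        if abs(step) < tol:
            break
    return k


# Synthetic, noise-free data generated from an assumed true rate of 0.7.
true_k = 0.7
xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [math.exp(true_k * x) for x in xs]

k_hat = gauss_newton_fit(xs, ys)
print(f"estimated k = {k_hat:.6f}")
```

A library in the DNLS setting would additionally let gradients flow through this solve to whatever produced `ys` or the cost function, which is where techniques such as implicit differentiation come in.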

A description of Turkish Discourse Bank 1.2 and an examination of common dependencies in Turkish discourse

Deniz Zeyrek , Mustafa Erolcan Er

Category: Natural Language Processing

2022-07-11

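We describe Turkish Discourse Bank 1.2, the latest version of a discourse corpus annotated for explicitly or implicitly conveyed discourse relations, their constitutive units, and their senses in the Penn Discourse Treebank style. We present an evaluation of the recently added tokens and examine three commonly occurring dependency patterns that hold among the constitutive units of a pair of adjacent discourse relations, namely shared arguments, full embedding, and partial containment of a discourse relation. We present three main findings: (a) implicitly conveyed relations occur more often than explicitly conveyed relations in the data; (b) it is much more common for two adjacent implicit discourse relations to share an argument than for two adjacent explicit relations to do so; (c) both full embedding and partial containment of discourse relations are pervasive in the corpus, which may be partly due to subordinating connectives, whose preposed subordinate clause tends to be selected together with the matrix clause rather than alone. Finally, we briefly discuss the implications of our findings for Turkish discourse parsing.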
